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ABSTRACT 

The technique of windowing has been often used in the implementation of the waveform relax- 
ations for solving ODEs or time dependent PDEs. Its efficiency depends upon problem stiffness 
and operator splitting. Using model problems, the estimates for window length and convergence 
rate are derived. The effectiveness of windowing is then investigated for non-stiff and stiff cases 
respectively. It concludes that for the former, windowing is highly recommended when a large 
discrepancy exists between the convergence rate on a time interval and the ones on its subintervals. 
For the latter, windowing does not provide any computational advantage if machine features are 
disregarded. The discussion is supported by experimental results. 
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1. Introduction. The waveform relaxation (WR) method was originally proposed 
for solving ordinary differential equations arising from very large scale integration 
(VLSI) circuit simulation [5] [9]. Unlike conventional timestepping methods, it iter- 
atively partitions a big system into mutually decoupled subsystems, and then solves 
each subsystem independently. Different discretizations and time steps are allowed for 
integrating subsystems. Based on its nature, the method has been proposed as a mul- 
tirate method for sequential computing or a parallel method on advanced computers 

fo- 
under reasonable assumptions for an ODE, the WR iteration has been shown to 
converge superlinearly on finite intervals [1] [5] [8]. The uniform convergence on an 
interval of [0, T] is reached in the exponential norm 

IMk.r •- max |e~^u(t)|, { > 0, 

which implies that, for many problems, the WR iteration will converge much faster 
on short intervals than on longer ones. In order to accelerate the convergence, the 
technique of windowing is recommended, in which the interval of integration is split 
into a series of subintervals, called windows , with iteration taking place on each window 
successively. 

The length of window is of practical important and strongly depends upon problem 
and machine involved. The general guidance for its selection and the way for evaluating 
its effectiveness are relatively unknown even though windowing has been a common 
practice in using the WR method. 

The estimates for time windows of the WR iteration were studied by Leimkuhler 
and Ruehli for RC circuits arising as simplified models of a VLSI interconnect [4]. Finer 
estimates were developed by Leimkuhler for a model linear second-order system [3]. Us- 
ing the speed of splitting and weighted spectral radius of iteration operator, Leimkuhler 
estimated the abscissa of (^-convergence, which then provided a priori estimate for the 
length of a window wherein convergence was approximately geometric with the given 
rate a;. His approach puts emphasize on the qualitative comparisons between splittings. 

In this paper, we focus on the time dependency of the approximation error and 
intimate relation between the WR iteration (or dynamic iteration called in [6]) for 
time dependent problems and the static iteration for corresponding steady-state (or 
static) problems. For certain model problems, it is possible to separate the factor 
that represents early sweeps from the one that dominates asymptotic behavior. Simple 
convergence estimates are therefore obtained. The estimates and the results observed 
in the experiments are compared and shown to have good agreements. Based on these 
estimates, the effectiveness of windowing is discussed for non-stiff and stiff problems 
respectively. General guidance for the use of windowing is concluded in the end. 

2. Waveform relaxation* Using a first-order linear system 
/ 1 \ du 

(1) — + Lu = /, t > 0, u(0) = u 0 , 
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with a given splitting L = M - N, the WR iteration can be illustrated by 

— + Mu (v) = Nu^ + /, t > 0, u (u) (0) = u 0 . 

dt 

It is an iterative process on a space of differential functions. The functions vS \ so 
called “waveforms,” are then discretized for numerical integration. The continuous 
approximation error := u ^ — u satisfies 

( 2 ) e^(t) = Se lv ~ l) {t), t > 0, 

where S is a linear operator on L p (R + ,C n ) (1 < p < oo) depending upon M and N, 
and is called the iteration operator. 

Several convergent splittings for an ODE system were proposed and discussed in 

[3] and [6]. We shall restrict ourselves to splittings that resemble Jacobi splitting and 
Gauss-Seidel splitting on linear systems with time independent coefficients, as described 
by Eq.(l). For simplicity, the space considered in this work is C°°([0, T]> C n ), the space 
of continuous C n -valued functions in [0, T], with ||-|| denotes Z°° norm for space variables: 

IMOII : = max K(0l» 


and HI • 1 1 |t stands for 

IIMIIt : = | max IKOII- 

The notation || • || will also be used as l°° induced matrix and operator norm. The issue 
of time discretization is beyond our consideration, for which the reader may refer to [7]. 

3. Convergence estimates. For certain type of problems or operator splittings, 
the factor representing early phase of iterations and the factor dominating asymptotic 
behavior in the approximation error can be separated. Laplace transform is a convenient 
tool for doing this. Applying Laplace transform to Eq.(2), the error is expressed as 

e< w >(z) = S(z)e( v -'\z) = S v {z)eW(z ), Rez > 0, 


where 

S(z) = (zI + M)~ 1 N 

is the Laplace transform of the convolution kernel of S. Note, .9(0) is the iteration 
operator for the steady-state problem Lu = f corresponding to Eq.(l). 

Theorem 1. Let L be split a s L = M - N with M = dl, d > 0. After v WR 
iterations, the error is bounded by 

||eW(t)|| < 9v(dt) ■ ||^(0)|| • |||e ( 0 ) |||T, t € [0,T], 
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where 


(3) 


,,x r*( w ) , _ “v 

9v(t) = T — = 1 “ e (Ej) = e EtT 


r ( U ) t=0 -• i=v *• 

and r ( (n) is the incomplete T-function. 

Proof. For M = dl, d > 0, 

S v (z) = ((z + d)-'N) v = (z/d+l)~ v S v (0). 
Let 

/„(*):=(*+ l)~\ 

The inverse Laplace transform of f v is 

/.<*) = t — L_ - e-v-'. 

{v - 1)! 

Hence the error in the time domain satisfies 

rdt 


» 


rdt 

(t) = 5 ,,J (0) l / u (r)e (0) (i - t / d)dr. 


Define 


For t € [0, T], 


9v(t ) := f fv( T )dr = t > 0. 

Jo 1 (t>) 


ll e ^ ) (0ll < ll- < ’’ v ( 0 )ll • max ( f f v (r )dr • max |ej 0) (i)|) 

1<*<71 Jo *e[0,T] 


= *.(*)• ||S'(0)||-|||e"»|||T. 

Equation (3) can be easily verified by induction. □ 

It is interesting to examine the bounds given by Theorem 1. Note that (||5 ft '(0)||) 1 / t; 
is time independent and approximates the asymptotic convergence rate either for ob- 
taining the steady-state solution or solutions over long time intervals. Function g v 
represents the time dependency of the error and dominates the convergence behavior 
at early phase of iteration or on short intervals. It is a monotone increasing function, 
bounded by 0 < g v (t) < 1 with 

</„(0) = 0, lira g v {t) = 1~. 

t — ► OO 
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These observations on the error bound agrees with computational experiences that 
the WR iteration converges faster on short intervals than on longer ones, and the 
convergence rates on any time intervals, including infinite interval, are at least as good 
as the one for the static iteration. 

When operator L has constant diagonal, Theorem 1 actually gives an error bound 
for Jacobi WR iteration. The next two theorems will give results for Gauss-Seidel WR 
iteration on model problems. 

Theorem 2. Let L = M - N. Assume that M and N are simultaneously diago- 
nalizable by matrix X , and all eigenvalues of M are positive. Then 

||e w (t)ll < *(<«) ■ l|S”(0)ll ■ co ni(X) ■ |||e' 0 >||| T , < £ [0,T], 

where d is the largest eigenvalue of M and cond{X) = ||X -1 || ■ ||X||. 

Proof. From the assumptions, there are diagonal matrices Am and An, such that 

M = X~'A M X, N = X~ l \ N X, 

and 

A m = {A,(M)}, A ,(M) > 0 for all i. 

In the Laplace domain, 

S’(z) = X-'((zI + A„)-'A„ yx = (X-'\- u "\" n X)(X-'(zA- m ' + I)-’X), 

leading to 

' /v(2/A,(M)) 

e« W = iT(«)X- kz/MM) ) ■ 

Applying inverse Laplace transform, one obtains 

\\e {v) (t)\\ < ||^(0)|| . II*- 1 II ■ max | / f v (r) ^ x tJ e) - r/A t (M))dr| 

j J 

< ||5' V (0)|| • ||eY _1 || ■ max(^(A,(Af)t)^ |x tJ |) • |||e (0) |||T 

j=i 

< g,(dt) ■ ||5*(0)|| • c<md(X) ■ |||e<°>|||r. □ 
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The lexicographic (or forward point) Gauss-Seidel WR iteration on heat equation 
with periodic boundary condition is an example that this theorem can be applied (see 
[10]). When operator L has the form 


(4) 


L = d 


I -B 
-R I ’ 


d > 0, 


a common structure when using a red-black ordering on certain model problems in 
ODEs or in time dependent PDEs, the error in Gauss-Seidel WR has a similar upper 
bound. 


Theorem 3. For operator L represented by Eq.(4), the error after v Gauss-Seidel 
WR iterations, expressed as e^ v \t) T = [e${t) T , tg\t) T ], is bounded by 

(5) l|£ W WII<S2»-,(A)-||S”(0)|M||4 o) |||r, *€[0,71. 


Proof. For Gauss-Seidel splitting, 


M = d 


I 0 
-R I 


, and N = d 


0 B 

0 0 


Laplace transform of the convolution kernel of S can be expressed as 


S(z) = ( zl + M)~ 1 N = 
which yields 

S v {z) 

and 


7m B 

0 (i/d'+l ) 2 B 


P = RB, Rez > 0, 




(*) 


0 f 2v - l (z/d)BP'’~' 1 

0 hv(z/d)P v 


f2v-\(z/d)BP v ~ 1 eg\z) 

f 2v (z/d)P v e < §\z) 


Back to the time domain, 


A v ) 


(t) = 


BP^! 0 dt f 2 v . 1 (r)ef(t-T/d)dT 

P v So hv( T )eg\t - r/d)dr 


Using the properties 


g v+l (t) < g v (t), t > 0, v > 0, 


mox{||5P‘'- 1 ||,||P"||}< 11^(0)11, 
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and 



O B P v ~ 1 

(note, S v (0) = q pv )i the inequality (5) follows immediately. □ 

The discussion above indicates that the error at early stage of the WR iteration is 
controlled by function g v . Simple convergence estimates are therefore derived naturally 
from this function. Given a convergence rate u> and an iteration number u, the length 
of a window of ^-convergence (see [3]) can be estimated by 

(6) T w := max{t : (g m td(v)(dt)) 1/v < w}, 

where mtd(-) is an integer function, defined by the operator splitting. For instance, 
rntd(v) = v and mtd(v ) = 2v — 1 for the cases discussed in Theorem 1 and Theorem 3 
respectively. A useful variant of Eq.(6) is the value 

(7) wr := (g m td(v){dT)Y ^ v , 

which gives an estimated average convergence rate on windows of length T in first v 
sweeps. 


Example 1. Consider the ODE system 

du 

m +L u - 0 ' 

where L — [— 1,2, — 1], a symmetric tridiagonal matrix with 2 and -1 on main and off 
diagonal. This system describes the nodal voltage of a linear resistor-capacitor (RG) 
network [3] [4]. Jacobi WR iteration was performed with randomly generated starting 
function u (0) . The trapezoidal rule was used in the integration with conservative time 
step At = 0.01 for simulating time-continuous iteration. For a given number of sweeps 
v, the observed convergence rates u 0 b s were collected as 


( max max 

l<t'<n «e[0,T] 




«, w (<) 



Figure 1 depicts the graph of for 5 Jacobi iterations ( v = 5) together with 
observed data marked by x’s. Table 1 shows more detailed comparison between the 
observed convergence rate u) Q t >s and estimated rate u>t on interval [0, X 1 ] . The estimates 
TJ s and ut's provided by Eq.(6)-(7) are surprisingly close to the observed data. Al- 
though such a good agreement cannot be predicted in general, reasonable match at 
early phase of iteration should be expected if similar error bounds, as given by these 
theorems, are conjectured. 
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Table 1 

Observed and estimated convergence rate on [ 0 , T] 


V 

Uobs/WT 

T=0.2 

T=0.4 

T=0.6 

H 

II 

p 

oo 

T=1.0 

3 

.2271/. 1994 

.3840/. 3620 

.4920/. 4939 

.5682/. 6006 

.6227/. 6864 

5 

.1536/. 1437 

.2793/. 2691 

.3801/.3783 

.4622/. 4730 

.5295/. 5550 

9 

.0000/. 0927 

.1796/. 1781 

.2555/. 2568 

.3242/. 3292 

.3852/. 3956 


4. Effectiveness of windowing. The theoretical analysis and practical experi- 
ments have revealed that the effectiveness of the WR method depends highly on the 
stiffness of the ODE solved. Not surprisingly, the effectiveness of using windows in the 
WR iteration is also closely related to the stiffness of the system. In this section, we 
investigate their relation, and show how to estimate the efficiency of windowing, which 
then results the general guidance for the implementation of the WR iteration. 

The effectiveness of windowing is discussed in terms of computational cost or op- 
eration counts. Following concepts and notations are needed. 

Let fi be average operation counts on unit windows per sweep. For a given error tol- 
erance e > 0, vr(e) denotes the average number of iterations needed for the convergence 
on windows of length T. Using Theorem 1, ur(e) can be estimated as 

v T {e) := min{u : g mtd(v) (dT ) < e}. 

The error tolerance t will be dropped whenever the context is clear. The total com- 
putations on a window of length T for the convergence is then approximately equal 
to 


C := vj'T g. 

Note that T fi is the average computations on a window of length T per sweep. Let this 
window be split into two subwindows of length T x and T 2 , + T 2 = T , with the WR 

iteration taking place on each of them until the tolerance level is reached one after the 
other. The total cost then satisfies 


v TiT\ii + vt 2 T 2 h < i>t'T g, T' = max{Ti,T 2 }. 

Define C win := vt*T g. C and C wtn will be referred as the average computations on an 
interval of length T without and with windowing respectively. Since vji < vt, we have 

( 8 ) c win < c. 

This indicates that windowing does not introduce extra computation. 

The number of iterations for the convergence on an interval of length T is closely 
related to the stiffness of the ODE involved. This can be seen from the behavior of 
the function g v (see Eq.(3)). For example, an approximation for the case stated in 
Theorem 1 is 


9v(dT) 


(dTf T 

V 7"! 
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leading to 


e 1/Sr (vr/2) 1/a < dT « (e5 T !) 1/iiT < e 1/5T ur. 


That is, the estimated number of iterations v T increases proportional to the parameter 
d, an indicator of the stiffness of the ODE solved. 

4.1. Non-stiff case. Given an interval of length T. Following above arguments, 
the number of the WR iterations for convergence is not too large for non-stiff systems. If 
the interval is split into k windows of equal length T' — T/k, the gain or the percentage 
of the savings of using windows can be measured by 

n C - C w in _ V tT H - kv T ’T'n 

bwtn '■ - c v t Th 

Vj — Vt> 

Vj 

If v T ~ v T ',S wtn « 0, not much computation can be saved by windowing. When 
vt > vt’i using windows, vt — t?T' sweeps of WR on this interval of length T are likely 
to be reduced. 

Table 2 lists the experimental results on Example 1 discussed in Section 3, as well 
as corresponding estimated values. The entries are of the form observed/estimated. 
The numbers listed inside parenthesis in column 2 are average number of iterations 
collected on subintervals of [0, 2] with length T. Windowing was used on intervals 
[0,T], T = 0.5, 1.0, and 2.0 with the window length 0.25. The iteration was terminated 
when the relative error was in the order of e = l.e — 7. Again, the estimates were 
in good agreement with the observed ones. Windowing reduced the computations by 
25-60%. It clearly suggests that windowing is quite efficient in reducing computations 
when a large discrepancy exists between the convergence rate on an interval and the 
ones on its subintervals. 


Table 2 

Effictiveness of windowing ( observed/ estimated ) 


Interval 

fo.ri 

No. of Iterations 
vt( ave. vt)/vt 

Computations 


Without Windowing 

With Windowing 


8 (7) /8 


2/x / 2 pi 

hmm 

laiffl 

10 (9) /10 


3.75 n / 4// 

25 / 20 


13 (12) / 1 3 

13 n / 13 p 

7.25 n / 8fi 

EiWfil 


17 (17) /18 

34 /i / 36 fi 

13.5/^ / 16 n 



Remark. Eq.(8) and the observed data in Table 2 seem to suggest choosing min- 
imum window length, which ironically is equivalent to the step size used in the time 
integration of subsystems. In this situation, the WR method is nothing but a time- 
stepping method for solving ODEs. However, recall that the method is proposed as 
a multirate method in the context of serial computation or a parallel method on ad- 
vanced computers. For the former, it is developed for problems in which the coupling 
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of subsystems is relatively loose and many subsystems allow large integration steps. 
The window length is therefore recommended as the largest step size used in the time 
integration of subsystems. For the later, machine characters such as vector length, 
communication overhead, play important roles. The study in this paper is restricted to 
the mathematical concerns only. 

4.2. Stiff case. In this situation, the WR iteration would take large number of 
iterations to converge in an interval. The convergence rate very likely has entered the 
asymptotic behavior. Estimating it in terms of function g v alone is no longer adequate. 
From the error bounds given by Section 3, the rate of convergence at time t would be 
dominated by 

(fc«w(<i()||s v (o)||) 1 / ”= 3 p (s(o)), 

the spectral radius of S'(O) or the convergence rate of the related static iteration. Since 
/ , (*^(0)) ' s time independent, the convergence rates on any intervals are almost identical, 
so are the numbers of iterations needed for the convergence on those intervals. 

Example 2. Consider the heat equation on the unit square fl = (0, 1) x (0, 1) with 
Dirichlet boundary conditions 

u t — Au = 0, (t , x) e (0, T] x ft 

u = 0, (t,x) € (0, T] x d(l, u(0,x) = u 0 (x), x € ft. 

The equation was first discretized in space, resulting the semi-discrete problem 

(9) ^ + L h U = 0, 

where Lh is the five point difference approximation operator to the Laplacian 



and h is the mesh size of space discretization. The red-black Gauss-Seidel WR was then 
implemented on the system (9) over time interval [0, T]. The iteration was terminated 
when the difference between the nth and v - 1th approximation 

Mlt/w _ f/(-»lll T 

reached the level of truncation error a safe stopping criterion proven by Nevan- 

linna [8]. Table 3 shows that the numbers of iterations needed on different time intervals 
are almost the same, confirming the above arguments. As is discussed, using small win- 
dows for this problem virtually has no mathematical advantage. 
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Table 3 

Number of iterations 


h 

[0,0.125] 

[0,0.25] 

[0,0.5] 

[0,1.0] 

1/8 

16 

17 

17 

17 

1/16 

53 

60 

60 

60 

1/32 

208 

239 

239 

239 


5. Conclusions. In this paper, the convergence estimates such as window length 
and convergence rate are developed using the qualitative comparison between the W R 
iteration and the corresponding static iteration. The effectiveness of windowing tech- 
nique for the WR method is discussed. The results proven in this work and observed 
in the experiments suggest that, for non-stiff ODEs, substantial computations can be 
saved by windowing when a large discrepancy exists between the convergence rate on 
an interval and the ones on its subinterval; while for stiff problems, which are typically 
arisen from time dependent PDEs, windowing has no mathematical advantage. Thus 
for stiff systems, the selection of the window length should be mainly determined by the 
machine features, such as memory capacity, vector length, cache size, communication 
cost, etc. 

Although only a few model problems are considered in this work, similar approach 
could be taken for some generalized problems. The guidance concluded above certainly 
provide helpful information in the implementation of the WR method for wide class of 
applications. 
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